Abstract

The family tree of a Galton-Watson branching process may contain $N$-ary subtrees,
i.e. subtrees whose vertices have at least $N \ge 1$ children. For family trees without infinite
$N$-ary subtrees, we study how fast $N$-ary subtrees of height $t$ disappear as $t \to \infty$.
1 Introduction, Statement of the Results and Related Studies

The family tree associated with a Bienaymé-Galton-Watson process describes the evolution of
a population in which each individual, independently of the others, creates $k$ new individuals
with probability $p_k$ ($k = 0, 1, ...$). We assume that at generation zero there is a single
ancestor, called the root of the tree, and let

$$f(s) = \sum_{k=0}^{\infty} p_k s^k \qquad (1)$$

denote the probability generating function (pgf) of the offspring distribution (with the
convention that $f(1) = 1$). We recall the well known construction of a Galton-Watson family
tree, noting that the individuals participating in the process and the parent-child relations
between them define the vertex-set and arc-set of the tree, respectively (for more details see
e.g. Harris (1963, Ch. 6)). By $\{Z_t, t = 0, 1, ...\}$ we denote the generation size process,
defined by the following recurrence:

$$Z_0 = 1, \qquad Z_{t+1} = \begin{cases} X_1 + \cdots + X_{Z_t} & \text{if } Z_t > 0, \\ 0 & \text{if } Z_t = 0. \end{cases} \qquad (2)$$

Here $\{X_j, j = 1, 2, ...\}$ are independent copies of a random variable whose pgf is given by (1).
In terms of trees, $\{Z_t, t = 0, 1, ...\}$ are the sizes of the strata of the Galton-Watson family
tree. That is, $Z_t$ equals the number of vertices which are at distance $t$ from the root. (We
recall that the distance between two vertices in a tree is determined by the number of arcs in
the path between them.)
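Recurrence (2) can be illustrated with a short simulation. The following is only a minimal sketch: the function names and the particular offspring law (three children with probability 0.3, otherwise one child) are assumptions made for illustration and do not come from the paper.

```python
import random


def simulate_generation_sizes(offspring_sampler, generations, rng):
    """Return [Z_0, ..., Z_generations] generated by recurrence (2)."""
    sizes = [1]  # Z_0 = 1: a single ancestor (the root)
    for _ in range(generations):
        current = sizes[-1]
        # Z_{t+1} is the sum of Z_t independent offspring counts;
        # an empty sum is 0, so extinction (Z_t = 0) is absorbing.
        sizes.append(sum(offspring_sampler(rng) for _ in range(current)))
    return sizes


def one_or_three(rng, p=0.3):
    """Hypothetical offspring law: 3 children with probability p, else 1."""
    return 3 if rng.random() < p else 1


rng = random.Random(0)  # fixed seed so that the run is reproducible
sizes = simulate_generation_sizes(one_or_three, 8, rng)
print(sizes)  # sizes of the strata at distances 0, 1, ..., 8 from the root
```

Because every individual in this illustrative law has at least one child, the simulated strata never shrink; other samplers can be passed in the same way.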

The probabilities $P(Z_t > 0)$ are often called survival probabilities of the process. Their
asymptotic behavior, as $t \to \infty$, was studied long ago by several authors. Let

$$\gamma_1 = \lim_{t \to \infty} P(Z_t = 0). \qquad (3)$$

It turns out that the parameters

$$a_1 = f'(\gamma_1), \qquad b_1 = f''(\gamma_1) \qquad (4)$$

play an important role in the study of survival probabilities of the process. (If $\gamma_1 = 1$,
here and further on, by $f'(1)$, $f''(1)$ and $f'''(1)$ we denote the left derivatives of the power
series (1) at the point 1; also note that $f'(1)$ is the mean value of the offspring distribution
and $f''(1)$ is its second factorial moment.) The following two results are well known and valid
for offspring distributions satisfying the inequality $p_0 + p_1 < 1$.

Result 1. [See Harris (1963, Ch. 1, Thms. 6.1 and 8.4).] (i) If $f'(1) > 1$ and $\gamma_1 > 0$, then
$0 < a_1 < 1$. (ii) For $0 < a_1 < 1$,

$$P(Z_t > 0) = 1 - \gamma_1 + d_1 a_1^t + O(a_1^{2t}), \qquad (5)$$

as $t \to \infty$, where $d_1 > 0$ is a certain constant.

Result 2. [See Harris (1963, Ch. 1, Thm. 6.1 and Sect. 10.2).] (i) If $f'(1) = 1$, then
$\gamma_1 = 1$. (ii) If $f'(1) = 1$ and $f'''(1) < \infty$, then

$$P(Z_t > 0) \sim \frac{2}{b_1 t}, \quad t \to \infty. \qquad (6)$$

Remark 1. Result 2 was first obtained by Kolmogorov (1938). It is also valid if $f''(1) < \infty$;
see e.g. Sevast'yanov (1971, Ch. 2, Sect. 2).

We will study special kinds of subtrees of a Galton-Watson family tree. We will consider
only rooted subtrees and call two such subtrees disjoint if they do not have a common vertex
different from the root. Next, for a fixed integer $N \ge 1$, we define a complete infinite $N$-ary
tree to be the family tree of a deterministic branching process with offspring pgf $f(s) = s^N$.
For a branching process $\{Z_t, t = 0, 1, ...\}$ defined by (2), we introduce the random variable
$V_N$, equal to the number of complete disjoint and infinite $N$-ary subtrees rooted at the ancestor.
If we restrict the process up to its $t$th generation ($t = 0, 1, ...$), or, equivalently, if we assume
that the Galton-Watson family tree is cut off at height $t$, we can similarly define the random
variable $V_{N,t}$ to be the count of the complete disjoint $N$-ary subtrees of height at least $t$,
which are rooted at the ancestor. It is clear that $V_N = \lim_{t \to \infty} V_{N,t}$ with probability 1.

Further on we will assume that the offspring distribution $\{p_k, k = 0, 1, ...\}$ is such that
$p_k < 1$ for all $k$ and $p_k > 0$ for some $k > N$. For $N \ge 1$, we also let

$$\gamma_{N,0} = 0, \qquad \gamma_{N,t} = P(V_{N,t} = 0), \quad t = 1, 2, ..., \qquad (7)$$

and

$$\gamma_N = P(V_N = 0) = \lim_{t \to \infty} \gamma_{N,t}. \qquad (8)$$

The last limit exists since the probabilities $\gamma_{N,t}$ monotonically increase in $t$.

In particular, for $N = 1$, the event $\{V_1 > 0\}$ implies that the family tree contains an
infinite unary subtree (infinite path), which means that the generations of the Galton-Watson
process never die out. In the same way, the event $\{V_2 > 0\}$ can be interpreted as the set of
trajectories of the process whose family trees grow faster than binary splitting.

Another important observation follows from the well known extinction criterion; see Harris
(1963, Ch. 1, Thm. 6.1). We give it here in terms of the pgf (1) and probabilities (8) as
follows: a necessary and sufficient condition for $\gamma_1 = 1$ is $f'(1) \le 1$; if $f'(1) > 1$, then
$\gamma_1$ is the unique solution in $[0, 1)$ of the equation

$$s = f(s). \qquad (9)$$

This enables one to restate Results 1(ii) and 2(ii) in terms of counts of unary subtrees (recall
(3) - (6) and Remark 1).

Result 1'. If $a_1 < 1$, then

$$P(V_{1,t} > 0 \mid V_1 = 0) = d_1 a_1^t + O(a_1^{2t})$$

as $t \to \infty$, where $d_1 > 0$ is a certain constant.

Result 2'. If $a_1 = 1$ and $f''(1) < \infty$, then

$$P(V_{1,t} > 0 \mid V_1 = 0) \sim \frac{2}{b_1 t}, \quad t \to \infty.$$
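The extinction criterion and eq. (9) can be explored numerically by iterating the pgf from 0, since $P(Z_t = 0)$ is the $t$-th iterate of $f$ at 0. The sketch below assumes a Poisson offspring law with mean $m$, so that $f(s) = e^{m(s-1)}$; the function names, tolerance and iteration cap are illustrative choices, not part of the paper.

```python
import math


def poisson_pgf(s, m):
    """pgf of a Poisson offspring law with mean m (an illustrative choice)."""
    return math.exp(m * (s - 1.0))


def extinction_probability(pgf, tol=1e-14, max_iter=100_000):
    """Approximate gamma_1 as the limit of the iterates f_t(0) = P(Z_t = 0)."""
    s = 0.0
    for _ in range(max_iter):
        nxt = pgf(s)
        if abs(nxt - s) < tol:
            return nxt
        s = nxt
    return s


# Subcritical mean f'(1) = 0.8 <= 1: the iterates approach 1.
gamma_sub = extinction_probability(lambda s: poisson_pgf(s, 0.8))

# Supercritical mean f'(1) = 2 > 1: the iterates approach the root of s = f(s) in [0, 1).
gamma_super = extinction_probability(lambda s: poisson_pgf(s, 2.0))

# For a Poisson law f'(s) = m f(s), so a_1 = f'(gamma_1) = m * gamma_1 at the fixed point.
a_1 = 2.0 * gamma_super

print(gamma_sub, gamma_super, a_1)
```

In the supercritical run the value of `a_1` lies strictly between 0 and 1, which is the situation covered by Result 1'.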

The main purpose of the present note is to study the survival of complete $N$-ary subtrees
on a Galton-Watson family tree. We extend Results 1' and 2' to integer values of $N$ greater
than 1. Below we give a brief history of this problem.

The question of how to compute the probability that the Galton-Watson process possesses
"the binary splitting property" was first raised and solved by Dekking (1991). The
general ($N \ge 2$) case was subsequently investigated by Pakes and Dekking (1991), who showed
that the probability $\gamma_N$, defined by (8), is the smallest solution in $[0, 1]$ of the equation

$$s = g_N(s), \qquad (10)$$

where

$$g_N(s) = \sum_{j=0}^{N-1} \frac{(1-s)^j f^{(j)}(s)}{j!} \qquad (11)$$
and $f(s)$ is the offspring pgf (1). We also point out that a particular case arising from a
study of Mandelbrot's percolation process was previously considered by Chayes et al. (1988).
Their problem is equivalent to finding a condition on $p$ for $\gamma_3 < 1$ when
$f(s) = (1 - p + ps)^9$. Furthermore, note that $g_1(s) = f(s)$ and thus, for $N = 1$, eq. (10)
reduces to (9). For particular offspring distributions, Pakes and Dekking (1991) encountered the
following phenomenon: if $N \ge 2$, then there is a critical value $m_N^c$ for the offspring mean
$f'(1)$ such that $\gamma_N = 1$ if $f'(1) < m_N^c$ and $\gamma_N < 1$ if $f'(1) \ge m_N^c$. Let
$\gamma_N^c$ be the critical probability obtained when $f'(1) = m_N^c$. It turns out, for instance,
that if $N = 2$ and the offspring distribution is geometric, then $m_2^c = 4$ and $\gamma_2^c = .75$;
for a Poisson offspring distribution the same parameters are: $m_2^c = 3.3509$, $\gamma_2^c = .4648$.
(Further numerical results in this direction can be found in Pakes and Dekking (1991) and Yanev
and Mutafchiev (2006).) This phenomenon is qualitatively different from what happens for $N = 1$,
where the extinction probability $\gamma_1 = 1$ if $f'(1) \le m_1^c = 1$, except for the trivial
case where $f(s) = s$, and $\gamma_1 < 1$ if $f'(1) > 1$.
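The jump of $\gamma_N$ at the critical mean can be observed numerically. The sketch below evaluates (11) for a Poisson offspring law, for which $f^{(j)}(s) = m^j e^{m(s-1)}$, and approximates the smallest solution of (10) by iterating $g_N$ from 0. The function names, the tolerance and the two test means (3.0 and 3.6, on either side of the reported $m_2^c = 3.3509$) are illustrative assumptions.

```python
import math


def g_poisson(s, m, n):
    """g_N(s) from (11) for a Poisson(m) offspring law:
    exp(m(s-1)) * sum_{j<N} [m(1-s)]^j / j!."""
    x = m * (1.0 - s)
    partial_sum = sum(x**j / math.factorial(j) for j in range(n))
    return math.exp(-x) * partial_sum


def gamma_n(m, n, tol=1e-13, max_iter=1_000_000):
    """Approximate the smallest solution of s = g_N(s) in [0, 1]
    by iterating g_N starting from 0."""
    s = 0.0
    for _ in range(max_iter):
        nxt = g_poisson(s, m, n)
        if abs(nxt - s) < tol:
            return nxt
        s = nxt
    return s


below = gamma_n(3.0, 2)  # mean below the critical value: gamma_2 = 1
above = gamma_n(3.6, 2)  # mean above the critical value: gamma_2 < 1
unary = gamma_n(2.0, 1)  # N = 1: g_1 = f, the usual extinction probability

print(below, above, unary)
```

For $N = 2$ the computed value drops from 1 to a number well below the critical probability .4648 as the mean crosses $m_2^c$, whereas for $N = 1$ the same routine returns the ordinary extinction probability.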
The case $N \ge 2$ seems to have been studied surprisingly later than the classical one when
$N = 1$. In fact, the first assertions of the fundamental theorem on the existence of infinite
unary subtrees on a Galton-Watson family tree appeared about 120 - 150 years earlier and the
problem was definitely settled around 1930. For more historical details, see e.g. Harris (1963)
and Sevast'yanov (1971). Recently, Yanev and Mutafchiev (2006) derived the probability
distributions of the random variables $V_{N,t}$ and $V_N$. The result for $V_{N,t}$ is given in the
form of a recurrence. Furthermore, the expression for the probability distribution of $V_N$ turns
out to be very simple: its probability mass function equals the difference between two particular
neighboring partial sums of the Taylor expansion of $f(1)$ around the point $\gamma_N$.

To state our main results in an appropriate form, we extend notations (4) to integer values
of $N \ge 2$. We set

$$a_N = g_N'(\gamma_N), \qquad b_N = g_N''(\gamma_N) \qquad (12)$$

and also recall definitions (7), (8) and (11).

Theorem 1 If $\gamma_N \in (0, 1)$ is the smallest solution of eq. (10), then,

(i) for $N \ge 2$, we have $a_N \le 1$.

(ii) If $a_N < 1$, then

$$P(V_{N,t} > 0 \mid V_N = 0) = d_N a_N^t + O(a_N^{2t})$$

as $t \to \infty$, where $d_N > 0$ is a certain constant.

(iii) If $a_N = 1$, then

(iiia) $b_N \ge 0$, and,

(iiib) for $N \ge 2$ and finite $b_N$,

$$P(V_{N,t} > 0 \mid V_N = 0) \sim \frac{2}{\gamma_N b_N t}, \quad t \to \infty.$$

Our paper is organized as follows. The proofs of the results are presented in Section 2.
We recall there some old and classical methods used in the theory of branching processes.
Section 3 contains a few numerical results for particular offspring distributions.

We conclude our introduction with a remark on studies which are closely related to our
model.

Remark 2. Pakes and Dekking (1991) noticed that there are links between complete infinite
$N$-ary subtrees on a Galton-Watson family tree, Mandelbrot's percolation process studied
by Chayes et al. (1988) and results obtained by Pemantle (1988) and related to a model of a
reinforced random walk. In particular, Pemantle (1988) established the following criterion for
$\gamma_N < 1$ (see his Lemma 5 or Pakes and Dekking (1991, pp. 356-357)): if for some
$s_0 \in (0, 1)$ we have $g_N(s_0) \le s_0$, then $\gamma_N \le s_0$. Here we also indicate a
relationship between the $N$-ary subtrees phenomenon and the existence of a $k$-core in a random
graph. The $k$-core of a graph is the largest subgraph with minimum degree at least $k$. This
concept was introduced by Bollobás (1984) in the context of finding large $k$-connected subgraphs
of random graphs. He considered the Erdős-Rényi random graph $G(n, p)$ with $n$ vertices in which
the possible arcs are present independently, each with probability $p$. If we set $p = \lambda/n$,
where $\lambda > 0$ is a constant, it is natural to ask: for $k \ge 3$, what is the critical value
$\lambda_c(k)$ of $\lambda$ above which a (non-empty) $k$-core first appears in $G(n, \lambda/n)$
with probability tending to 1 as $n \to \infty$? To answer this question Pittel et al. (1996)
considered a Galton-Watson family tree rooted at a vertex $x_0$ (ancestor) of the graph
$G(n, \lambda/n)$ and assumed that the offspring distribution of the branching process is Poisson
with mean $\lambda$. Let $B_k$ denote the event that $x_0$ has at least $k$ children each of which
has at least $k - 1$ children each of which has at least $k - 1$ children, and so on. It is clear
that this assumption slightly modifies the concept of a complete infinite $(k-1)$-ary subtree (the
only difference occurs in the assumption for the offspring number of the ancestor $x_0$). Pittel
et al. (1996) found the threshold $\lambda_c(k)$ for the emergence of a non-trivial $k$-core in
$G(n, \lambda/n)$ and showed that, except at the critical value, the number of vertices in the
$k$-core approaches $P(B_k) n$ as $n \to \infty$. Their results also showed that a giant $k$-core
appears suddenly when the number of arcs in the random graph reaches $c_k n/2$, where the
constants $c_k$ are explicitly computed. There is a remarkable coincidence between the constants
$c_k$ and the critical means $m_{k-1}^c$ ($k = 3, 4, 5$) of the Poisson offspring distributions
which yield existence of a $(k-1)$-ary subtree on a Galton-Watson family tree, given by Yanev
and Mutafchiev (2006, p. 232). The idea of embedding a Poisson branching process in the
random graph model was recently developed by Riordan (2007), who gave a new proof of the
results of Pittel et al. (1996) and extended them to a general model of inhomogeneous random
graphs with independence between their arcs.



2 Proofs of the Results 

First, we recall the result of Pakes and Dekking (1991): the probability $\gamma_N$, defined by
(8), is the smallest solution in $[0, 1]$ of eq. (10). To prove part (i) of the theorem, note that
$\gamma_N > 0$ implies that $g_N(0) > 0$. Therefore, for $s \in [0, \gamma_N)$, the graph of the
function $y = g_N(s)$ lies above the diagonal of the unit square in the coordinate system $sOy$.
At $s = \gamma_N$ the curve $y = g_N(s)$ crosses or touches the diagonal $y = s$. If it touches it,
then $a_N = g_N'(\gamma_N) = 1$. If $y = g_N(s)$ crosses the diagonal, then, for some sufficiently
small $\epsilon \in (0, \gamma_N)$, we have $g_N(\gamma_N - \epsilon) > \gamma_N - \epsilon$ and
$g_N(\gamma_N + \epsilon) < \gamma_N + \epsilon$. Hence
$g_N(\gamma_N + \epsilon) - g_N(\gamma_N - \epsilon) < 2\epsilon$. Therefore the derivative

$$g_N'(s) = \frac{(1-s)^{N-1} f^{(N)}(s)}{(N-1)!}$$

does not exceed 1 for a certain $s = s_\epsilon \in (\gamma_N - \epsilon, \gamma_N + \epsilon)$, by
the mean value theorem. Letting $\epsilon \to 0$ and using the continuity of $g_N'(s)$, we get
assertion (i).
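The closed form of $g_N'(s)$ used in this argument is stated without derivation. A short check, obtained by differentiating (11) term by term and shifting the index in the second sum, is sketched below; it is our own verification and assumes only that $f$ has derivatives up to order $N$ on $[0, 1)$.

```latex
\begin{align*}
g_N'(s)
  &= \sum_{j=0}^{N-1} \frac{d}{ds}\left[\frac{(1-s)^j f^{(j)}(s)}{j!}\right] \\
  &= \sum_{j=0}^{N-1} \frac{(1-s)^j f^{(j+1)}(s)}{j!}
     - \sum_{j=1}^{N-1} \frac{(1-s)^{j-1} f^{(j)}(s)}{(j-1)!} \\
  &= \sum_{j=0}^{N-1} \frac{(1-s)^j f^{(j+1)}(s)}{j!}
     - \sum_{j=0}^{N-2} \frac{(1-s)^{j} f^{(j+1)}(s)}{j!} \\
  &= \frac{(1-s)^{N-1} f^{(N)}(s)}{(N-1)!}.
\end{align*}
```

All terms cancel except the one with $j = N - 1$ in the first sum. In particular $g_N'(s) \ge 0$ on $[0, 1]$, so $g_N$ is nondecreasing there.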

Assertion (ii) can be obtained using a result on iterations of functions due to Koenigs 
(1884) (see also Harris (1963, Ch. 1, Sect. 8.3)). Below we state a suitable modification of it 
as a separate lemma. The proof follows the same line of reasoning as in Harris (1963, Ch. 1, 
Thm. 8.4). 

Lemma 1 Let

$$h(s) = \sum_{j=0}^{\infty} h_j s^j$$

($h_j$ real) be a function which is analytic in $|s| < 1$, strictly increasing in $[0, 1]$ and such
that $h(1) = 1$. Let

$$h_0(s) = s, \quad h_1(s) = h(s), \quad h_{t+1}(s) = h(h_t(s)), \quad t = 1, 2, ... \qquad (13)$$

be the sequence of iterations of $h(s)$. Suppose that the equation

$$s = h(s) \qquad (14)$$

has a solution in $[0, 1]$ and let $q$ be the least one in $[0, 1]$. If $q$ satisfies $h'(q) < 1$, then

$$h_t(0) = q - d\,[h'(q)]^t + O([h'(q)]^{2t})$$

as $t \to \infty$, where $d > 0$ denotes an absolute constant.

We will apply Lemma 1 setting $h(s) = g_N(s)$. Define the iterations $g_{N,t}(s)$ of the function
$g_N(s)$ as in (13). Also, recall that $g_N(1) = 1$ and $\gamma_{N,0} = 0$ (see definitions (11) and
(7), respectively). We set $q = \gamma_N$ in eq. (14). Then, we use the recurrence
$\gamma_{N,t} = g_N(\gamma_{N,t-1})$; see Yanev and Mutafchiev (2006, p. 227). Iterating $t$ times as
in (13), we get $\gamma_{N,t} = P(V_{N,t} = 0) = g_{N,t}(0)$. Hence, by Lemma 1 and the first
notation in (12),

$$\gamma_{N,t} = \gamma_N - d_N' a_N^t + O(a_N^{2t})$$

as $t \to \infty$, where $d_N' > 0$ denotes an absolute constant. Since $\{V_{N,t} = 0\}$ implies
$\{V_N = 0\}$, we have $P(V_{N,t} > 0 \mid V_N = 0) = 1 - \gamma_{N,t}/\gamma_N$. Dividing both sides
of the above equality by $\gamma_N$ and writing conditional probabilities for $V_{N,t}$, we obtain
assertion (ii) with $d_N = d_N'/\gamma_N$.
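The geometric rate in assertion (ii) can be checked numerically: the ratios of successive gaps $\gamma_N - \gamma_{N,t}$ should approach $a_N$. The sketch below assumes $N = 2$ and a Poisson offspring law with mean $m = 4$ (a supercritical choice made only for illustration); the function names and the number of iterations are likewise assumptions.

```python
import math


def g2_poisson(s, m):
    """g_2(s) = f(s) + (1 - s) f'(s) for f(s) = exp(m(s - 1))."""
    x = m * (1.0 - s)
    return math.exp(-x) * (1.0 + x)


def g2_prime(s, m):
    """g_2'(s) = (1 - s) f''(s) for the same Poisson pgf."""
    return (1.0 - s) * m * m * math.exp(m * (s - 1.0))


m = 4.0
iterates = [0.0]  # gamma_{2,0} = 0
for _ in range(200):
    # recurrence gamma_{2,t} = g_2(gamma_{2,t-1})
    iterates.append(g2_poisson(iterates[-1], m))

gamma = iterates[-1]     # numerically converged value of gamma_2
a = g2_prime(gamma, m)   # a_2 = g_2'(gamma_2)

# Ratios of successive gaps gamma_2 - gamma_{2,t}; they should settle near a_2.
ratios = [
    (gamma - iterates[t + 1]) / (gamma - iterates[t])
    for t in range(5, 20)
]
print(a, ratios[-1])
```

The last ratio agrees with `a` to several digits, which is consistent with a gap of the form $d_N' a_N^t$ plus a term of order $a_N^{2t}$.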

To prove (iiia), let us assume that $b_N < 0$ (see the second notation in (12)). This shows that
$g_N'(s)$ decreases in a neighborhood of $s = \gamma_N$. Hence, there exists a sufficiently small
number $\delta > 0$ such that, for any $s \in (\gamma_N - \delta, \gamma_N)$, we have
$g_N'(s) > g_N'(\gamma_N) = a_N = 1$. Therefore, $[g_N(s) - s]' > 0$, and so, the function
$g_N(s) - s$ increases in $(\gamma_N - \delta, \gamma_N]$. Thus, for any
$s \in (\gamma_N - \delta, \gamma_N)$, we have $g_N(s) - s < g_N(\gamma_N) - \gamma_N = 0$.
Combining the inequalities $g_N(s) < s$, $g_N(0) > 0$ and using the continuity of $g_N(s)$, we
conclude that there is some $s_0 < \gamma_N$ that solves eq. (10). This contradicts the assumption
that $\gamma_N$ is the smallest solution in $(0, 1]$ of eq. (10). So, (iiia) is proved.

The asymptotic given in assertion (iiib) will also follow from classical results on iterations
of analytic functions, increasing on a segment of the real axis. One possible proof may use
a general result of Harris (1963, Ch. 1, Lemma 10.1) establishing uniform asymptotics for
$1/[1 - h_t(s)]$, where $h_t(s)$ denote the iterations defined by (13) and the complex variable $s$
varies in some particular subsets of the unit disc. In our case it suffices, however, to consider
the behavior of $h_t(s)$ only at $s = 0$. The problem turns out to be similar to that for the
critical branching process which was studied first by Kolmogorov (1938). The proof of the
next lemma repeats the arguments given by Sevast'yanov (1971, Ch. 2, Sect. 2).

Lemma 2 Suppose that $h(s)$, $h_t(s)$, $t = 0, 1, ...$, and $q$ are the same as in Lemma 1.
Furthermore, suppose that $h'(q) = 1$ and $h''(q) \in (0, \infty)$. Then, we have

$$\frac{1}{q - h_t(0)} = \frac{t\,h''(q)}{2} + o(t) \qquad (15)$$

as $t \to \infty$.

To show how assertion (iiib) follows from this lemma we set in both sides of (15):
$q = \gamma_N$, $h(s) = g_N(s)$, $h_t(0) = g_{N,t}(0) = \gamma_{N,t}$ and
$h''(q) = g_N''(\gamma_N) = b_N$. Thus, we obtain

$$\frac{1}{\gamma_N - \gamma_{N,t}} = \frac{b_N t}{2} + o(t), \quad t \to \infty. \qquad (16)$$

To complete the proof of (iiib) it remains to take the reciprocal of (16), divide both sides by
$\gamma_N$ and convert the ratio $\gamma_{N,t}/\gamma_N$ into a conditional probability for $V_{N,t}$.



3 Numerical Results 

Geometric distribution. We look at the case where

$$f(s) = \frac{1-p}{1-ps}, \qquad g_N(s) = 1 - \left[\frac{p(1-s)}{1-ps}\right]^N, \qquad 0 < p < 1.$$

Pakes and Dekking (1991) established in this case that the critical mean for $N = 2$ is
$m_2^c = 4$, which implies that the critical value for the parameter $p$ is $p_2^c = 4/5$. It is easy
to see that the least solution in $[0, 1]$ of eq. (10) is $\gamma_2^c = 3/4$. Calculating the first
two derivatives of $g_2(s)$ at $s = 3/4$, we get $a_2^c = 1$, $b_2^c = 2$. Therefore, by assertion
(iiib) of Theorem 1,

$$P(V_{2,t} > 0 \mid V_2 = 0) \sim \frac{4}{3t}, \quad t \to \infty.$$
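The $1/t$ decay in the critical geometric case can be observed by direct iteration of $g_2$. The sketch below uses the values $p = 4/5$ and $\gamma_2^c = 3/4$ quoted above; the function name and the horizon of 50000 iterations are illustrative assumptions, and convergence of the scaled quantity is slow, so only rough agreement should be expected at finite $t$.

```python
def g2_geometric(s, p):
    """g_2(s) = 1 - [p(1 - s) / (1 - p s)]^2 for the geometric offspring law."""
    r = p * (1.0 - s) / (1.0 - p * s)
    return 1.0 - r * r


p_c = 0.8       # critical parameter p = 4/5 (offspring mean 4)
gamma_c = 0.75  # critical probability gamma_2 = 3/4

t_max = 50_000
s = 0.0  # gamma_{2,0} = 0
for _ in range(t_max):
    s = g2_geometric(s, p_c)  # after the loop, s equals gamma_{2,t_max}

# P(V_{2,t} > 0 | V_2 = 0) = (gamma_2 - gamma_{2,t}) / gamma_2
conditional = (gamma_c - s) / gamma_c
scaled = t_max * conditional  # should be close to 4/3 for large t

print(scaled)
```

The printed value is close to $4/3$, in line with the asymptotic obtained from Theorem 1(iiib).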

Poisson distribution. The Poisson offspring distribution has the pgf $f(s) = e^{m(s-1)}$, $m > 0$.
Whence

$$g_N(s) = e^{m(s-1)} \sum_{j=0}^{N-1} \frac{[(1-s)m]^j}{j!}.$$

In this case $m_2^c = 3.3509$ and the least solution in $[0, 1]$ of eq. (10) is
$\gamma_2^c = .4648$ (see Yanev and Mutafchiev (2006)). Numerical computations with a greater
level of accuracy show that $a_2^c = 1$, $b_2^c = 1.48235$ and, by Theorem 1(iiib),

$$P(V_{2,t} > 0 \mid V_2 = 0) \sim \frac{2.9028}{t}, \quad t \to \infty.$$
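The critical Poisson constants for $N = 2$ can be reproduced with a few lines of code. The sketch below rests on a reduction that is ours rather than the paper's: writing $x = m(1-s)$, the tangency conditions $g_2(s) = s$ and $g_2'(s) = 1$ become $m = e^x/x$ and $e^x = 1 + x + x^2$, and the positive root of the latter is found by bisection. The bracket $[1, 3]$ and all names are illustrative assumptions.

```python
import math


def tangency_gap(x):
    """exp(x) - (1 + x + x^2); its positive root gives the critical case."""
    return math.exp(x) - (1.0 + x + x * x)


# Bisection on [1, 3]: the gap is negative at 1 and positive at 3.
lo, hi = 1.0, 3.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if tangency_gap(mid) < 0.0:
        lo = mid
    else:
        hi = mid
x = 0.5 * (lo + hi)

m_c = math.exp(x) / x                      # critical mean, from g_2'(s) = 1
gamma_c = 1.0 - x * x * math.exp(-x)       # s = 1 - x / m
b_c = m_c * m_c * math.exp(-x) * (x - 1.0) # g_2''(s) = m^2 e^{-x} (x - 1)
constant = 2.0 / (gamma_c * b_c)           # constant in Theorem 1(iiib)

print(m_c, gamma_c, b_c, constant)
```

The computed values agree with the figures 3.3509, .4648, 1.48235 and 2.9028 quoted above to about three decimal places.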
One-or-many distribution. This is a two-parameter family of discrete distributions defined
for some $p \in (0, 1)$ and integer $r > N > 1$ by the equalities: $p_r = p$, $p_1 = 1 - p$. Clearly,
$f(s) = (1 - p)s + ps^r$, and hence

$$g_N(s) = 1 - p \sum_{j=N}^{r} \binom{r}{j} (1-s)^j s^{r-j}.$$

Pakes and Dekking (1991) showed that if $r = N + 1$, then $\gamma_N^c = 1/N^2$ and the threshold
value of the parameter $p$ is $p_N^c = (1 - 1/N)(1 - 1/N^2)^{-N}$. For $r = 3$ and $N = 2$, we have
$g_2'(s) = 6ps(1 - s)$, $p_2^c = 8/9$, $\gamma_2^c = 1/4$. Thus, we get $a_2^c = 1$, $b_2^c = 8/3$,
and hence by Theorem 1(iiib),

$$P(V_{2,t} > 0 \mid V_2 = 0) \sim \frac{3}{t}, \quad t \to \infty.$$
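Since all quantities in the case $r = 3$, $N = 2$ are rational, the values above can be confirmed in exact arithmetic. The sketch below expands $g_2(s) = 1 - p(1-s)^2(1+2s)$ into the polynomial $1 - p + 3ps^2 - 2ps^3$ (our own expansion of the formula for $g_N$) and evaluates it and its derivatives at $s = 1/4$; the function names are illustrative.

```python
from fractions import Fraction

p = Fraction(8, 9)      # critical parameter p_2^c
gamma = Fraction(1, 4)  # critical probability gamma_2^c


def g2(s):
    """g_2(s) = 1 - p (1 - s)^2 (1 + 2s), written as a polynomial in s."""
    return 1 - p + 3 * p * s**2 - 2 * p * s**3


def g2_prime(s):
    """g_2'(s) = 6 p s (1 - s)."""
    return 6 * p * s * (1 - s)


def g2_second(s):
    """g_2''(s) = 6 p (1 - 2s)."""
    return 6 * p * (1 - 2 * s)


a = g2_prime(gamma)         # a_2^c
b = g2_second(gamma)        # b_2^c
constant = 2 / (gamma * b)  # constant in Theorem 1(iiib)

print(g2(gamma), a, b, constant)
```

The script confirms that $1/4$ is a fixed point of $g_2$ at which the derivative equals 1, and that the constant $2/(\gamma_2^c b_2^c)$ equals 3.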